A large number of papers have been published attempting to give an
analytical basis for the performance of Turbo-codes. It has been shown that
performance improves with increased interleaver length, and procedures have
been given for picking the best constituent recursive systematic convolutional
codes (RSCC’s). However, testing by computer simulation is still required to
verify these results. This thesis begins by describing the encoding and decoding
schemes used. Next, simulation results for several memory 4 RSCC’s are shown.
It is found that the best BER performance at low $E_b/N_0$ is not given by the
RSCC’s that were found using the analytic techniques given so far. Next, results
are given from simulations using a smaller memory RSCC for one of the
constituent encoders. A significant reduction in decoding complexity is obtained
with minimal loss in performance. Simulation results are then given for a rate
1/3 Turbo-code, with the result that this code performed as well as a rate 1/2
Turbo-code as measured by the distance from their respective Shannon limits.
Finally, the results of simulations in which an inaccurate noise variance
measurement was used are given. From this it is observed that Turbo-decoding is
fairly stable with regard to noise variance measurement.
Chapter 1


Introduction 


Low bit error rates (BER) in high noise environments have required the
use of very complex channel coding and decoding schemes. According to
Shannon’s theorem, very long random codes can approach Shannon’s limit [1].
This limit is the value of $E_b/N_0$, which depends on the rate of the code,
above which the probability of bit error can be made arbitrarily small (in
practice a BER of $10^{-5}$ or some other convenient figure of merit is used).
The $E_b/N_0$ required for given rates is shown in Figure 1.1, assuming no
intersymbol interference and minimum Nyquist bandwidth [2]. However, long
random codes are, in general, extremely difficult to decode. Several approaches
have been tried in order to decrease the complexity of the decoder. A typical
practice, introduced by Forney [3], is the concatenation of more than one code.
The information bits are coded by an outer encoder, and the output of the outer
encoder is fed into a second, inner encoder whose output is sent to the channel.
At the receiver the channel output is decoded by the inner decoder first, and
the result is used as the input to the outer decoder. A typical example is a
Reed-Solomon code as the outer code with a convolutional code as the inner
code, as shown in Figure 1.2.

Figure 1.1 The Limits for Reliable Communication



Figure 1.2 A Serial Concatenated Scheme
Recently a new concatenation scheme, called parallel concatenation, has
been proposed. Parallel concatenation is done by encoding information streams
that are linked through a pseudo-random interleaver, as shown in Figure 1.3.
Delays are not shown in the figure. The input to the interleaver is presented
as blocks of bits. Using parallel concatenation in conjunction with recursive
systematic convolutional codes (RSCC’s) has produced codes, nicknamed
Turbo-codes [4], that have phenomenal error correcting capacity at very low bit
energy to noise variance ratios ($E_b/N_0$). For example, the rate 1/2 code in
[4] (obtained by puncturing every other bit from each RSCC output) was found to
have a BER of $10^{-5}$ at an $E_b/N_0$ of only 0.7 dB. This is a savings of
about 9 dB over uncoded BPSK, which is shown in Figure 1.4, but more
importantly it is within 0.7 dB of the Shannon limit for a rate 1/2 code (see
Figure 1.1).
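The two figures quoted above can be checked numerically. The sketch below is not taken from the thesis: it assumes the unconstrained-input AWGN capacity bound, $R \le \frac{1}{2}\log_2(1 + 2R\,E_b/N_0)$, for the Shannon limit (the curve in Figure 1.1 may have been computed under slightly different assumptions) and the standard expression $Q(\sqrt{2E_b/N_0})$ for uncoded BPSK. The function names are illustrative.

```python
import math

def shannon_limit_db(rate):
    """Smallest Eb/N0 (in dB) allowing reliable communication at `rate`
    information bits per channel use, under the assumed capacity bound
    rate <= 0.5 * log2(1 + 2 * rate * Eb/N0)."""
    ebn0 = (2.0 ** (2.0 * rate) - 1.0) / (2.0 * rate)
    return 10.0 * math.log10(ebn0)

def bpsk_ber(ebn0_db):
    """Bit error rate of uncoded BPSK on an AWGN channel: Q(sqrt(2 Eb/N0))."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return 0.5 * math.erfc(math.sqrt(ebn0))

# Limits for the two code rates discussed in this thesis.
for r in (1 / 2, 1 / 3):
    print(f"rate {r:.3f}: limit {shannon_limit_db(r):+.2f} dB")
# Uncoded BPSK needs roughly 9.6 dB for a BER near 1e-5.
print(f"BPSK BER at 9.6 dB: {bpsk_ber(9.6):.2e}")
```

Under these assumptions the rate 1/2 limit is 0 dB, so a code operating at 0.7 dB is 0.7 dB away from it, and the gap to uncoded BPSK at a BER of $10^{-5}$ is close to 9 dB.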




Figure 1.3 The General Encoding Scheme for Turbo-codes. 
While these codes have very good BER performance, they also present some
difficulties. One of the problems is that decoding these codes requires soft
outputs. The optimal decoding algorithm, the Maximum A Posteriori Probability
(MAP) algorithm, is very complex due to the number of operations needed and the
amount of memory required. There are simpler decoders, such as the Soft Output
Viterbi Algorithm (SOVA) and the Max-Log MAP, but they are both sub-optimal
algorithms.

Figure 1.4 BER vs $E_b/N_0$ for Uncoded BPSK

One of the objectives of this research is to investigate, by computer
simulation, the effects of using different generators for the RSCC’s on the
performance of the Turbo-codes. While several analytical methods have been
proposed for choosing proper RSCC’s in the Turbo-code system, they have not all
been tested by computer simulation. Computer simulation is necessary to confirm
results that were given by analytical methods. It has also been seen that
concatenating a smaller memory convolutional encoder with a memory four
convolutional code does not degrade performance very much, while decoding is
less complicated [5]. The performance of these schemes will be evaluated. In
cases where bandwidth is not a concern but power is limited, lower rate
encoding schemes can be of use. Simulations will be run to determine whether a
lower rate (1/3) Turbo-code scheme gives good results, and the results of the
rate 1/3 code will be compared with those of the rate 1/2 Turbo-code scheme.
Finally, the effects of inaccurate measurement of noise variance on Turbo-code
performance will be investigated (the MAP decoder requires an estimate of noise
variance). This is done to see how stable the Turbo-decoding process is when
the noise variance is measured inaccurately.

This thesis will begin by describing the general encoding scheme. Then
detailed descriptions of the encoding components of Turbo-codes, including
descriptions of the construction of RSCC’s and the interleavers, as well as
motivations for their use, will be given. Next will be the description of the
decoding process, beginning with a description of the soft output decoders
(specifically the MAP algorithm) and then describing the Turbo-decoding process.
Finally the research findings will be presented.





Chapter 2 


Overview of Encoding Components 
2.1 General Overview 

Most Turbo-codes are encoded by concatenating two RSCC’s through an
interleaver. A block of message bits is encoded with an RSCC. That same block
of message bits is interleaved by a pseudo-random interleaver and encoded with
another RSCC (see Fig. 1.3). The systematic information is sent only once, not
separately with each RSCC.

The reason that this channel coding scheme works so well is that it
combines three ideas that help to produce good codes [6]. The three ideas are:

combining several codes by concatenation

maximum use of channel information (i.e. soft decoding)

random-like distribution of codewords

The purpose of this chapter is to show how Turbo-codes use the 
RSCC’s and the interleaver to mimic random codes in some ways. Soft decoding 
algorithms will be discussed in chapter 3. 

It was shown by Shannon that large random codes can perform near the
Shannon limit. This suggests that good codes should have a distance distribution
that mimics that of random coding rather than simply having a large minimum
distance. The weight distribution histogram of a fixed length random block code
would be very close to a binomial distribution. It would have very few low
weight or high weight codewords, and the majority of the codewords would have
a weight very close to the middle of the weight spectrum. Designing such codes
with arbitrary parameters (i.e. length, rate) and with enough structure to be
decoded with a reasonable amount of complexity is not possible yet. However,
Turbo-codes have been shown to have a weight distribution with a shape similar
to that of random codes. The following sections will detail how each component
of the Turbo-encoder allows Turbo-codes to mimic random codes.


2.2 Recursive Systematic Convolutional Codes

This section will begin with an example of a non-systematic convolutional 
code (NSCC). From there it is shown how to construct RSCC’s and some of the 
properties of RSCC’s are given. 

Channel codes have been classified into two main categories, block codes
and convolutional codes. Block coding is performed by accepting a given number
of bits (k) and using algebraic rules to form a number of parity bits (p). When
the information is transmitted the parity bits are appended to the information
bits. The rate of the code, k/n, is the number of information bits (k) divided
by the total number of bits sent (n = k + p).
Usually convolutional encoding is done by accepting bits serially, one bit
at a time, through m tapped delay elements (a more general procedure is shown
in [7]). This means that the output bits depend not only on the current input
bit but also on at least the previous m input bits. An (n, k = 1, m)
convolutional code can be implemented by a circuit that accepts 1 input bit at
a time and has n linear sequential output circuits with input memory of order
m. An example of a (2, 1, 2) nonsystematic encoder is shown in Figure 2.2.1.
One way to think about the output of the convolutional encoder is to consider
its response to an impulse when the encoder is in the zero state. Because the
encoder is linear, the impulse response of the system can be used to obtain a
semi-infinite generator matrix. The generator matrix, G, of the circuit shown
is given in Figure 2.2.2. Notice that the first row is the impulse response of
the system (11 10 11). The generator bits are grouped in pairs: the first bit
of each pair is from $y_{1k}$ and the second is from $y_{2k}$. One way to
generate the output for a given input sequence, $\{d_k\}$, is to multiply the
input row vector by the generator matrix, remembering that addition is done
modulo 2. Thus, if d = [1 0 1] then the output is the modulo-2 sum of the first
and third rows of G,

$$y = dG = [11\ 10\ 00\ 10\ 11].$$

Figure 2.2.1 A Non-Systematic Convolutional Encoder

Figure 2.2.2 A Generator Matrix
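The generator-matrix multiplication described above can be sketched in a few lines. This is an illustration only: it builds a truncated generator matrix from the impulse response (11 10 11) quoted in the text, with each row shifted right by two bits relative to the row above, and the function names are not from the thesis.

```python
IMPULSE = [1, 1, 1, 0, 1, 1]   # impulse response (11 10 11) of the (2, 1, 2) encoder

def generator_matrix(impulse, n_inputs, n_out=2):
    """Truncated generator matrix: row r is the impulse response
    shifted right by n_out * r positions."""
    width = n_out * (n_inputs - 1) + len(impulse)
    rows = []
    for r in range(n_inputs):
        row = [0] * width
        row[n_out * r : n_out * r + len(impulse)] = impulse
        rows.append(row)
    return rows

def encode(d, impulse=IMPULSE, n_out=2):
    """Multiply the input row vector d by G using modulo-2 addition."""
    G = generator_matrix(impulse, len(d), n_out)
    out = [0] * len(G[0])
    for bit, row in zip(d, G):
        if bit:                                   # add (XOR) the selected row
            out = [a ^ b for a, b in zip(out, row)]
    return out

print(encode([1, 0, 1]))
```

For d = [1 0 1] only the first and third rows are selected, so the result is their modulo-2 sum.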


The fact that the output of a convolutional encoder depends not only on
the current input but also on the previous m inputs suggests that we can gain
insight into the properties of a convolutional encoder with a state diagram. A
state diagram for the encoding circuit in Figure 2.2.1 is shown in Figure
2.2.3. This diagram can be important for determining some of the distance
properties of convolutional codes. These distance properties can give
information about how well a given code will perform. The state diagram shows
the states (0, 1, 2, 3), the inputs and the outputs they cause. For example, if
the encoder was in state 2 and a 1 was received, the next state would be state
3 and the output at that time would be (0 1).

Figure 2.2.3 The Input-Output State Diagram of the NSCC

Usually the most important distance measure for convolutional codes is the
minimum free distance. This is defined as [7]

$$d_{free} = \min\{\, d(v', v'') : u' \neq u'' \,\}$$

where v’ and v” are the codewords corresponding to the input vectors u’ and u”
respectively ($d_{free}$ is not related to $\{d_k\}$, which was defined as the
input sequence). This means that $d_{free}$ is the minimum distance between any
two codewords in the code. Another way of saying this is that the free distance
of a code is the minimum number of bits that need to be changed in a codeword
for the result to be a different codeword. This is important for determining
the error correcting ability of a code.
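As a rough check of this definition, the free distance of the example NSCC can be found by brute force. The sketch assumes the tap connections 111 and 101, which reproduce the impulse response (11 10 11) and the state-2 transition quoted earlier; because the code is linear, the distance between two codewords equals the weight of their modulo-2 sum, so it is enough to search for the lightest nonzero codeword. The search depth `max_len` is an arbitrary illustrative choice.

```python
from itertools import product

def nscc_step(state, bit):
    """One step of the (2, 1, 2) encoder; state = 2*d[k-1] + d[k-2]."""
    d1, d2 = state >> 1, state & 1
    y1 = bit ^ d1 ^ d2          # taps 111
    y2 = bit ^ d2               # taps 101
    return (bit << 1) | d1, (y1, y2)

def codeword_weight(bits, memory=2):
    """Hamming weight of the codeword, with zeros appended to return to state 0."""
    state, weight = 0, 0
    for b in list(bits) + [0] * memory:
        state, (y1, y2) = nscc_step(state, b)
        weight += y1 + y2
    return weight

def free_distance(max_len=8):
    """Minimum weight over all nonzero inputs of length <= max_len."""
    best = None
    for n in range(1, max_len + 1):
        for bits in product((0, 1), repeat=n):
            if bits[0] == 0:    # leading zeros only shift the codeword
                continue
            w = codeword_weight(bits)
            best = w if best is None else min(best, w)
    return best

print(free_distance())
```

The lightest codeword is the impulse response itself, which has weight 5.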

The example given is for an NSCC. However, RSCC’s have been
discovered which perform better than the best NSCC’s at any SNR for high code
rates (rate > 2/3) [8]. These encoders are constructed from NSCC’s by using a
feedback loop and setting one of the outputs, $y_{1k}$, equal to the input,
$d_k$. Since the output of these codes is separated into the systematic portion
and the remaining portion, the remaining portion will be called the parity
sequence and the parity bit at time k will be denoted by $p_k$. An example of
an RSCC is shown in Figure 2.2.4, with the state diagram of this encoder given
in Figure 2.2.5.

The generator given in Figure 2.2.4 is called a 5_7 RSCC. The 5 and 7
are octal numbers that are converted to binary to represent the connections
in a generator circuit. The first number will be called the FB (feedback)
connection, while the second will be called the FF (feedforward) connection.

Figure 2.2.4 A Recursive Systematic Convolutional Encoder

Figure 2.2.5 Input-Output State Diagram of the RSCC

It was claimed that these codes perform better than the NSCC’s at high code
rates. A high code rate is accomplished by puncturing the outputs of the
convolutional encoder, that is, by systematically deleting some of the output
bits. While puncturing can be done in different ways, it is usually done by
eliminating every other bit of the non-systematic portion ($p_k$ in Figure
2.2.4), and it will be done this way for the remainder of this thesis. For this
punctured code the rate would then be 2/3 (1 information bit for every 1 1/2
bits transmitted). For Turbo-codes the overall rate has generally been 1/2,
obtained by using two punctured RSCC’s and transmitting the systematic portion
only once.
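The rate bookkeeping for puncturing can be illustrated as follows. This is a sketch only; the choice of which parity positions are kept (`offset`) is an assumption, since the text only states that every other parity bit is deleted.

```python
from fractions import Fraction

def puncture(parity, offset=0):
    """Keep every other parity bit, starting at index `offset` (assumed pattern)."""
    return parity[offset::2]

def rate(n_info, n_sent):
    """Code rate as an exact fraction."""
    return Fraction(n_info, n_sent)

info = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]      # any block of K information bits
p1 = [0, 0, 1, 1, 1, 0, 0, 0, 1, 1]        # parity from the first RSCC
p2 = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]        # placeholder parity for a second RSCC

# Single punctured RSCC: K systematic bits plus K/2 parity bits.
single = rate(len(info), len(info) + len(puncture(p1)))
# Turbo-code: systematic bits sent once, plus two punctured parity streams.
turbo = rate(len(info), len(info) + len(puncture(p1)) + len(puncture(p2, 1)))
print(single, turbo)
```

With K = 10 the single punctured code sends 15 bits (rate 2/3) and the Turbo-code sends 20 bits (rate 1/2).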

The reason that RSCC’s are important is that they have been found to give
the greatest gain when used as the parallel concatenated codes [2] (it has been
shown that NSCC’s give almost no gain when constructed as Turbo-codes). One
of the ways that they can be seen to be different from the NSCC’s is that a
finite weight input sequence can be mapped into an infinite weight output
sequence. This is shown by the impulse response of the encoder of Figure 2.2.4,
which is $p_k$ = [1 1 1 0 1 1 0 1 1 0 1 1 0 ...]. Notice that after the first
parity bit the sequence repeats itself with a period of 3 bits. In general the
impulse response of a well designed memory m RSCC will repeat itself after
$2^m - 1$ bits. An NSCC, being nonrecursive, maps a finite weight input
sequence into a finite weight output sequence. Since one of the goals is to
make the codewords have a random distribution, and since the output weight of
an NSCC is somewhat correlated with its input weight, NSCC’s are not as well
suited to designing random-like codes.
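The periodic impulse response can be reproduced with a small encoder model. Figure 2.2.4 is not available here, so the tap assignment below is an assumption: feedback $1 + D + D^2$ and feedforward $1 + D^2$, chosen because it reproduces the impulse response quoted above. The function name is illustrative.

```python
def rsc_parity(bits):
    """Parity sequence of a memory-2 RSCC with assumed taps:
    feedback 1 + D + D^2, feedforward 1 + D^2."""
    a1 = a2 = 0                 # register contents a[k-1], a[k-2]
    parity = []
    for d in bits:
        a = d ^ a1 ^ a2         # feedback sum entering the register
        parity.append(a ^ a2)   # feedforward taps on a[k] and a[k-2]
        a1, a2 = a, a1
    return parity

impulse = [1] + [0] * 12
print(rsc_parity(impulse))      # weight keeps growing as the input is extended
```

A single 1 at the input never lets the register return to zero, so the parity stream cycles through the pattern 1 1 0 indefinitely, which is the period $2^2 - 1 = 3$ mentioned in the text.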

It was shown in [6] that for most input sequences the output weight of
RSCC’s has the same distribution as that of a random code sequence. While most
input sequences will have an output weight that approximates that of random
sequences, there are input sequences that cause low output weights. For
example, there are sequences with as few as two ones that will cause the
encoder to go from the zero state to a nonzero state and back, and so generate
low weight codewords. For the encoder of Figure 2.2.4 a sequence that would do
this is $d_k$ = [1 0 0 1 0 0 0 0 ...]. The parity output for this sequence is
$p_k$ = [1 1 1 1 0 0 0 0 ...]. This means that any sequence that is a shifted
version of the one mentioned will produce a codeword of weight 6 (two
systematic ones and four parity ones). These codewords are examples of the
codewords that cause the codes to perform poorly. The object of encoding
Turbo-codes through an interleaver is to “boost” the low output weight
codewords that would be generated by a single RSCC. In other words, the
interleaver is designed to force most of those input words that produce low
weight output codewords through RSCC1 (i.e. few ones in $p_{1k}$) to produce
higher weight codewords through RSCC2 ($p_{2k}$).
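The weight-2 inputs that return the encoder to the zero state can be listed by direct search. As before, the taps (feedback $1 + D + D^2$, feedforward $1 + D^2$) are an assumption that matches the sequences quoted in the text; the block length of 12 is arbitrary.

```python
def rsc_run(bits):
    """Return (parity sequence, final register state) for the assumed RSCC."""
    a1 = a2 = 0
    parity = []
    for d in bits:
        a = d ^ a1 ^ a2
        parity.append(a ^ a2)
        a1, a2 = a, a1
    return parity, (a1, a2)

def low_weight_pairs(length=12):
    """Map the separation between two input ones to the codeword weight,
    keeping only the inputs that bring the encoder back to state zero."""
    found = {}
    for sep in range(1, length):
        bits = [0] * length
        bits[0] = bits[sep] = 1
        parity, final = rsc_run(bits)
        if final == (0, 0):
            found[sep] = 2 + sum(parity)   # systematic weight + parity weight
    return found

print(low_weight_pairs())
```

Only separations that are multiples of the period 3 terminate the parity stream, and the shortest one gives the weight-6 codeword described above. An interleaver that changes the separation of the two ones to a value that is not a multiple of 3 prevents the second encoder from producing such a short parity burst.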

When decoding convolutional codes it is desirable to force the encoder
into a known final state to protect the final few information bits. Unlike
NSCC’s, RSCC’s cannot be driven to the all zero state by appending a fixed
number of zeros (this can be seen in the state diagram, Figure 2.2.5). One
simple, sub-optimal solution is to append no bits to the end of the block,
which leaves the final bits unprotected. In this case neither the final state
of the encoder nor the final bits are known. Another choice is to force the
encoder into the all zero state by a proper choice of m end bits (where m is
the encoder memory). This allows the decoder to know that the encoder ends in
the all zero state while not knowing the final m bits.
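One way to choose the m end bits is to feed back the current register sum, so that a zero enters the register at each tail step. The sketch below illustrates this for the memory-2 encoder, again assuming feedback $1 + D + D^2$ and feedforward $1 + D^2$; it is not claimed to be the termination method used in the simulations.

```python
def terminate(bits, memory=2):
    """Encode `bits`, then append `memory` tail bits chosen so that the
    register of the assumed RSCC returns to the all zero state."""
    a1 = a2 = 0
    parity = []
    for d in bits:
        a = d ^ a1 ^ a2
        parity.append(a ^ a2)
        a1, a2 = a, a1
    tail = []
    for _ in range(memory):
        d = a1 ^ a2             # cancels the feedback, so a = 0 enters the register
        a = d ^ a1 ^ a2
        tail.append(d)
        parity.append(a ^ a2)
        a1, a2 = a, a1
    return tail, parity, (a1, a2)

tail, parity, final = terminate([0, 0, 1, 0, 0, 0, 1, 0, 0, 0])
print(tail, final)
```

The tail bits depend on the data, which is why the decoder knows the final state but not the final m bits.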

Choosing the best RSCC generators for Turbo-codes has been done by 
several methods. One method that has been used to determine the best generators 
is using the encoder with the best distance properties [8]. Another method is 
given in [9]. This method involves using a primitive polynomial as the FB 
connection and determining the FF connections based on the resulting BER. That 
paper also lists several good generators. 

2.3 Interleavers 

The use of a good interleaver is the most important factor in achieving the
best possible performance of Turbo-codes [10]. The interleaver permutes the
information bits in such a way as to make the output of RSCC2 (from Figure 1.3)
appear to be independent of the information sequence and therefore random-like,
while keeping a structure that permits decoding. While what exactly makes up
the best pseudo-random interleavers is not completely understood, and the
mathematics needed to analyze them is somewhat difficult, there have been some
investigations that give heuristic ideas as to why random interleavers work
[10]. It has also been found that good interleavers for Turbo-codes are not
hard to find [11]. This section will discuss a procedure for creating a
pseudo-random interleaver and also show why nonrandom block interleavers do
not work well in Turbo-codes [10].

In this discussion nonrandom block interleavers will refer to a structure
that reads bits in through the rows and out by the columns. Pseudo-random and
random interleavers will refer to block interleavers that read bits in through
the rows but read them out using some other method.

Interleavers had been used prior to Turbo-codes in order to break up
patterns of errors in bursty channels. To do this a nonrandom block interleaver
would often be used. As mentioned, in this type of interleaver the bits would be
read in by rows and read out by columns. In this way a sequence of the form

$d_0, d_1, d_2, d_3, d_4, d_5, d_6, d_7, d_8, d_9, d_{10}, d_{11}, d_{12}, d_{13}, d_{14}, d_{15}$

that was read into a four by four square matrix would be read out as

$d_0, d_4, d_8, d_{12}, d_1, d_5, d_9, d_{13}, d_2, d_6, d_{10}, d_{14}, d_3, d_7, d_{11}, d_{15}$

Although this sequence has been mixed up, it does not appear random to the
channel. It can be seen that if a sequence is correlated then this interleaving
procedure will change the correlation in a uniform way.
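The row-in, column-out rule can be written directly as an index calculation. This is a minimal illustration of the nonrandom block interleaver described above; the function name is not from the thesis.

```python
def block_interleave(seq, rows, cols):
    """Write `seq` into a rows x cols matrix row by row, read it out column by column."""
    if len(seq) != rows * cols:
        raise ValueError("sequence length must equal rows * cols")
    return [seq[r * cols + c] for c in range(cols) for r in range(rows)]

# Indices of d_0 ... d_15 after a 4 x 4 block interleaver.
print(block_interleave(list(range(16)), 4, 4))
```

Neighbouring inputs always end up exactly `rows` positions apart, which is the uniform change in correlation noted in the text.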

One procedure for creating a pseudo-random interleaver is given in [10].
The procedure is as follows: for an M×M memory (where M is a power of 2) the
bits to be interleaved are written row by row into a square matrix. If i and j
are the addresses of the row and column for writing, respectively (with the
first row and column being labeled row 0 and column 0), and $i_r$ and $j_r$ are
the row and column for reading, respectively, then the rule for reading is

$$i_r = (M/2 + 1)(i + j) \bmod M$$

$$\xi = (i + j) \bmod 4$$

$$j_r = \{[P(\xi)\,(j + 1)] - 1\} \bmod M$$

$P(\xi)$ is a number, relatively prime with M, that is selected by the address
$\xi = (i + j) \bmod 4$. $P(\xi)$ is given as follows:

P(0) = 17; P(1) = 37; P(2) = 19; P(3) = 29;

P(4) = 41; P(5) = 23; P(6) = 13; P(7) = 7;

(The only difference between this interleaver algorithm and the one used in our
simulations is that the address $\xi$ is taken modulo 8 for a 256x256
interleaver.) The sequence

$d_0, d_1, d_2, d_3, d_4, d_5, d_6, d_7, d_8, d_9, d_{10}, d_{11}, d_{12}, d_{13}, d_{14}, d_{15}$

will now be interleaved by this pseudo-random interleaver. The output is given by

$d_0, d_{13}, d_8, d_7, d_{12}, d_9, d_6, d_3, d_{10}, d_5, d_2, d_{15}, d_4, d_1, d_{14}, d_{11}$

While the output from this interleaved pattern is not random per se, it does
appear, at first glance, to be more “random” than the output of the previous
interleaver. However, it is difficult to say how random an interleaver looks,
especially for small blocks. Right now the only way to test whether an
interleaver is random enough in a Turbo-code scheme is to run simulations with
it. Deinterleaving is the inverse function of interleaving.
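The reading rule can be sketched as follows. The interpretation that output position (i, j) takes the bit stored at (i_r, j_r) is an assumption, adopted because it reproduces the 4x4 output sequence quoted above; `xi_mod` is 4 for that example and would be 8 for the 256x256 interleaver mentioned in the text. Function names are illustrative.

```python
P = [17, 37, 19, 29, 41, 23, 13, 7]

def read_address(i, j, M, xi_mod):
    """Read address (i_r, j_r) for write address (i, j)."""
    i_r = ((M // 2 + 1) * (i + j)) % M
    xi = (i + j) % xi_mod
    j_r = (P[xi] * (j + 1) - 1) % M
    return i_r, j_r

def interleave_order(M, xi_mod):
    """Source index read at each output position, scanning (i, j) row by row."""
    order = []
    for i in range(M):
        for j in range(M):
            i_r, j_r = read_address(i, j, M, xi_mod)
            order.append(i_r * M + j_r)
    return order

def interleave(seq, order):
    return [seq[src] for src in order]

def deinterleave(seq, order):
    """Inverse permutation: put each bit back at its source index."""
    out = [None] * len(seq)
    for pos, src in enumerate(order):
        out[src] = seq[pos]
    return out

order = interleave_order(4, 4)
print(order)
```

The order is a permutation of 0 to 15, so applying `deinterleave` after `interleave` returns the original block.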

The reason that random interleavers work in Turbo-coding schemes is
that they better “imitate” a random sequence to the channel. Since the goal of
Turbo-codes is to create somewhat random codewords (as given by their output
weight distribution) for a given input word, it can be seen that an output
sequence that is only distantly related to its input would be desirable. This
means that the output of RSCC2, $p_{2k}$ from Figure 1.3, should be nearly
independent of the sequence $d_k$.

Some analysis of the distance properties of nonrandom block interleaved
sequences is given in Appendix 1. It is shown that nonrandom block interleavers
can produce output sequences with high weights for input sequences with weights
2 or 3. But for input sequences of weight 4 this is not necessarily the case.
This motivates the need for random interleavers.



Chapter 3 


Soft Output Decoding

3.1 Overview of Soft Decoding 

One of the factors that makes Turbo-codes work well was discussed in the 
previous chapter (approximating random codes). In this chapter it is shown how 
all the information from the channel is used. To do this soft output decoding is 
needed. This allows information to be passed from one decoder to another 
without loss of information. This requires a more complicated decoding system 
than is usually used with convolutional codes. Several algorithms have been 
proposed to generate the soft decisions. The Maximum-A-posteriori Probability 
(MAP) algorithm [12] is the optimal algorithm and will be discussed extensively 
in section 3.2. The Max-Log MAP [13], a simplification of the MAP algorithm,
and the Soft Output Viterbi Algorithm (SOVA) [14] will also be discussed briefly.
After an example of the MAP algorithm is given in section 3.3, the procedure for
decoding Turbo-codes will be discussed in sections 3.4-3.6.

3.2 MAP Algorithm 

The MAP algorithm is the optimal algorithm for the minimization of
probability of bit error. The algorithm can also generate the probabilities of
a bit being 1 or 0. This is important because it is used to give a reliability
value by using the log-likelihood value of a bit $d_k$,
$\Lambda(d_k) = \log(\Pr\{d_k = 1\}/\Pr\{d_k = 0\})$. $\Pr\{d_k = i\}$ is the
probability that the decoded bit $d_k = i$ (i = 0, 1). This $\Lambda(d_k)$ is
used to determine a soft output value. The sign of $\Lambda(d_k)$ determines
whether the bit is a zero or one while the magnitude determines the reliability
of the decoded bit. The log function is the natural logarithm (base e). The
notation used in this derivation is as follows. $R_{k1}^{k2}$ is the received
sequence from time k1 to time k2. This is an encoded sequence that has been
corrupted by noise. $R_1^f$ is the entire received sequence from time 1 to time
f. $R_k$ is the received information at time unit k. $S_k$ is the state of the
encoder at time unit k. The value of the state at time k, $S_k$, is denoted by
m, while the value of the state at time k-1, $S_{k-1}$, is denoted by m’. M is
the total number of states. Hence m, m’ = 0, 1, ..., M-1. It will be assumed
that the encoder starts in the zero state.
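The relation between the log-likelihood value, the hard decision and the reliability can be illustrated with a few lines. This is only a sketch of the definition of $\Lambda(d_k)$ given above; the function names are illustrative.

```python
import math

def llr(p_one):
    """Log-likelihood value log(Pr{d=1} / Pr{d=0}) for a bit with Pr{d=1} = p_one."""
    return math.log(p_one / (1.0 - p_one))

def hard_decision(value):
    """The sign of the log-likelihood value gives the decoded bit."""
    return 1 if value > 0 else 0

for p in (0.1, 0.4, 0.6, 0.9):
    value = llr(p)
    print(p, hard_decision(value), round(abs(value), 2))
```

Probabilities near 0 or 1 give a large magnitude (a reliable decision), while probabilities near 1/2 give a value close to zero.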

As stated, the MAP algorithm gives the decision for every bit (i.e. 0 or 1)
and a reliability value for the bit (higher values being more reliable) given
that all bits have been received. Mathematically this can be done by finding the
probabilities of all state transitions. To do this we find

$$\Pr\{S_{k-1} = m';\ S_k = m \mid R_1^f\} \tag{3.2.1}$$

Since this form is more difficult to work with, it is converted to an equivalent
form

$$\Pr\{S_{k-1} = m';\ S_k = m;\ R_1^f\} \,/\, \Pr\{R_1^f\} \tag{3.2.2}$$

The equivalence between (3.2.1) and (3.2.2) is given by Bayes’ rule. Since
$\Pr\{R_1^f\}$ is a constant for a given received sequence, only the numerator
of (3.2.2) needs to be found. The following notation is introduced to allow for
ease of exposition.

$$\sigma_k(m, m') = \Pr\{S_{k-1} = m';\ S_k = m;\ R_1^f\} \tag{3.2.3}$$

The probability that a bit is zero or one can be determined from (3.2.3), up to
the constant factor $\Pr\{R_1^f\}$, as:

$$\Pr\{d_k = i\} = \sum_{(m, m') \in A_k(i)} \sigma_k(m, m') \tag{3.2.4}$$

where $A_k(i)$ is the set of state transitions that cause the output i at time k.

The essential idea of decoding a bit is to split the probability that a state
transition has occurred into three portions. The first part is developed from
the received information prior to the time of the state transition. The second
portion is formed from the received information after the state transition. The
third portion is based on the received information at the time of the state
transition. This can be expressed symbolically by introducing the following
symbols.

$$\alpha_k(m) = \Pr\{S_k = m;\ R_1^k\} \tag{3.2.5}$$

$$\beta_k(m) = \Pr\{R_{k+1}^f \mid S_k = m\} \tag{3.2.6}$$

$$\gamma_i(R_k, m, m') = \Pr\{d_k = i;\ S_k = m;\ R_k \mid S_{k-1} = m'\} \tag{3.2.7}$$

Assuming that any state transition is described by a Markov process, the
value of $\sigma_k(m, m')$ is given by

$$\sigma_k(m, m') = \alpha_{k-1}(m')\,\gamma_i(R_k, m, m')\,\beta_k(m) \tag{3.2.8}$$

where i is the input bit that causes the transition from m’ to m.

What (3.2.8) has shown is that the transition probability, $\sigma_k(m, m')$,
can be broken up into parts determined by the first k-1 transitions, the final
(f - k) transitions and the transition at time k. This is important because the
parts given by $\alpha_k(m)$ and $\beta_k(m)$ can be calculated recursively with
the following formulas [12]

$$\alpha_k(m) = \sum_{m'} \sum_{i=0}^{1} \gamma_i(R_k, m, m')\,\alpha_{k-1}(m') \tag{3.2.9}$$

$$\beta_k(m) = \sum_{m'} \sum_{i=0}^{1} \gamma_i(R_{k+1}, m', m)\,\beta_{k+1}(m') \tag{3.2.10}$$

In (3.2.10) and (3.2.12) m’ denotes the state at time k+1. Sometimes the
$\gamma_i(R_k, m, m')$ values are not probability values but are values of a
probability density (as will be seen in the example). The $\alpha_k(m)$ and
$\beta_k(m)$ will then need to be normalized as follows.

$$\alpha_k(m) = \frac{\sum_{m'} \sum_{i=0}^{1} \gamma_i(R_k, m, m')\,\alpha_{k-1}(m')}{\sum_{m} \sum_{m'} \sum_{i=0}^{1} \gamma_i(R_k, m, m')\,\alpha_{k-1}(m')} \tag{3.2.11}$$

$$\beta_k(m) = \frac{\sum_{m'} \sum_{i=0}^{1} \gamma_i(R_{k+1}, m', m)\,\beta_{k+1}(m')}{\sum_{m} \sum_{m'} \sum_{i=0}^{1} \gamma_i(R_{k+1}, m', m)\,\alpha_k(m)} \tag{3.2.12}$$

Since the probabilities at the first state are known (the encoder begins in the
zero state) the $\alpha_k(m)$ can be calculated recursively from 1 to f. As soon
as all the $\alpha_k(m)$ are calculated, the $\beta_k(m)$ can be calculated from
the final bit back to the first.

With this information $\sigma_k(m, m')$ can be determined. Knowing
$\sigma_k(m, m')$ allows for the calculation of the log likelihood value,
$\Lambda(d_k)$, which is

$$\Lambda(d_k) = \log \frac{\sum_{m} \sum_{m'} \gamma_1(R_k, m, m')\,\beta_k(m)\,\alpha_{k-1}(m')}{\sum_{m} \sum_{m'} \gamma_0(R_k, m, m')\,\beta_k(m)\,\alpha_{k-1}(m')} \tag{3.2.13}$$
The essence of the algorithm is the use of probabilities to decode bits, as
opposed to the Viterbi algorithm, which uses metric values. The MAP algorithm,
given the probabilities that the encoder is in a state at time zero and the
received channel values, recursively calculates the probabilities of the
encoder being in any state at any time. All of these $\alpha_k(m)$ have to be
stored for all values of k and m (for the decoder that achieved a BER of
$10^{-5}$ in [4] at an $E_b/N_0$ of 0.7 dB, k and m range over approximately
65000 and 16 values respectively). A similar process is used to find the
$\beta_k(m)$ after the entire sequence has been received. With these parameters
the probabilities that the encoder was in any state can be derived and, along
with the received channel values, are used to find the log likelihood value.
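The recursions (3.2.9), (3.2.10) and (3.2.13) can be put together in a compact sketch. It is written for the 4-state RSCC only, with assumed taps (feedback $1 + D + D^2$, feedforward $1 + D^2$), takes the branch values from a Gaussian channel with the constant set to one, and, as a simplification, normalizes both $\alpha$ and $\beta$ by their own sums at each time step. That scaling differs from (3.2.12) but cancels in the ratio of (3.2.13). The final $\beta$ is initialized to the final $\alpha$.

```python
import math

def transitions():
    """(previous state, input bit, next state, parity bit) for the assumed RSCC;
    state = 2*a[k-1] + a[k-2]."""
    table = []
    for s in range(4):
        a1, a2 = s >> 1, s & 1
        for d in (0, 1):
            a = d ^ a1 ^ a2
            table.append((s, d, (a << 1) | a1, a ^ a2))
    return table

def map_decode(x, y, n0):
    """Log-likelihood values for received systematic values x and parity values y
    (punctured parity positions are passed as 0)."""
    trans, f, M = transitions(), len(x), 4

    def gamma(k, d, p):                       # k is 1-based, constant set to one
        return (math.exp(-(x[k - 1] - (2 * d - 1)) ** 2 / n0) *
                math.exp(-(y[k - 1] - (2 * p - 1)) ** 2 / n0))

    alpha = [[0.0] * M for _ in range(f + 1)]
    alpha[0][0] = 1.0                         # encoder starts in state zero
    for k in range(1, f + 1):                 # forward recursion
        for prev, d, nxt, p in trans:
            alpha[k][nxt] += gamma(k, d, p) * alpha[k - 1][prev]
        total = sum(alpha[k])
        alpha[k] = [a / total for a in alpha[k]]

    beta = [[0.0] * M for _ in range(f + 1)]
    beta[f] = alpha[f][:]                     # final state unknown
    for k in range(f - 1, -1, -1):            # backward recursion
        for prev, d, nxt, p in trans:
            beta[k][prev] += gamma(k + 1, d, p) * beta[k + 1][nxt]
        total = sum(beta[k])
        beta[k] = [b / total for b in beta[k]]

    llrs = []
    for k in range(1, f + 1):
        num = den = 0.0
        for prev, d, nxt, p in trans:
            term = gamma(k, d, p) * alpha[k - 1][prev] * beta[k][nxt]
            if d:
                num += term
            else:
                den += term
        llrs.append(math.log(num / den))
    return llrs
```

A practical decoder would work in the log domain or guard against underflow for long blocks; that is omitted here to keep the correspondence with the equations visible.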


3.3 Example of the MAP Algorithm 

A simple example of the use of the MAP algorithm will now be given.
The example will be done using a (5, 7) octal generator (Figure 2.2.4). For this
generator, parity bit outputs and state transitions are given by the state
transition diagram of Figure 2.2.5.

Ten random bits have been generated and the output of the encoder is

data bits $\{d_k\}$:   [0 0 1 0 0 0 1 0 0 0]

parity bits $\{p_k\}$: [0 0 1 1 1 0 0 0 1 1]

Turbo-codes are usually punctured to increase the rate of the total code. When
this is done certain bits are deleted according to a given rule. Every other
parity bit is not sent in this example. The information is sent over an AWGN
channel. To make decoding simpler to understand, the bits are mapped by the
modulator through a linear transformation, so that the inputs to the decoder
are $x_k = (2d_k - 1) + \text{noise}$ and $y_k = (2p_k - 1) + \text{noise}$.
The bits which have been deleted by puncturing are inserted as zeros. This is
what happened to our received data with a noise variance of 1.6:

$\{x_k\}$: [-1.04  -1.14  1.73  -1.48  -.02  -1.49  -.53  -1.71  -1.94  -2.37]

$\{y_k\}$: [-.70   0      -.23  0      1.78  0      -.59  0      1.53   0    ]

Errors have occurred in the 7th column of the systematic bits and the third
column of the parity bits.
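The error positions stated above can be confirmed by comparing the sign of each received value with the transmitted bit. The sketch uses the data from this example; treating an exact zero as a punctured position is an assumption made for the illustration.

```python
d = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]                    # data bits
p = [0, 0, 1, 1, 1, 0, 0, 0, 1, 1]                    # parity bits
x = [-1.04, -1.14, 1.73, -1.48, -0.02, -1.49, -0.53, -1.71, -1.94, -2.37]
y = [-0.70, 0, -0.23, 0, 1.78, 0, -0.59, 0, 1.53, 0]  # 0 marks a punctured bit

def error_columns(bits, received):
    """1-based columns where the sign of the received value disagrees with
    the transmitted bit (2*bit - 1); punctured positions are skipped."""
    cols = []
    for k, (b, r) in enumerate(zip(bits, received), start=1):
        if r == 0:
            continue
        if (r > 0) != (b == 1):
            cols.append(k)
    return cols

print(error_columns(d, x), error_columns(p, y))
```

A hard-decision receiver would therefore make one systematic error and one parity error on this block; the MAP decoder instead keeps the magnitudes as reliability information.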

The decoding procedure can now be implemented. The first step is to
calculate $\alpha_k(m) = \Pr\{S_k = m;\ R_1^k\}$ for all states and times.
Knowing that the encoder began in the zero state allows us to know that
$\alpha_0(0) = 1$ while $\alpha_0(m) = 0$ for m not equal to 0. From this and
the received values the rest of the $\alpha_k(m)$ can be calculated. They are

k =        0     1     2     3     4     5     6     7     8     9     10
state 3  [ 0     0     .02   .03   .80   .12   .03   .33   .19   .63   .02 ]
state 2  [ 0     .02   .07   .83   .11   .03   .77   .18   .36   .02   .32 ]
state 1  [ 0     0     .00   .11   .06   .79   .12   .36   .32   .33   .63 ]
state 0  [ 1     .98   .91   .03   .03   .06   .08   .13   .14   .02   .02 ]
From column k = 0 to column k = 1, the probability that the encoder went
to state 0 after the first bit had arrived is the sum of the probabilities of
transitioning to state 0 from any previous state that could possibly lead to
state 0. The only two states that can lead to state 0 (from the state diagram,
Figure 2.2.5) are states zero and one. Therefore the probability of being in
state 0 at time 1 (after the first systematic bit and parity bit arrive) is

$$\Pr\{S_0 = 0;\ R_1^0\}\,\Pr\{d_1 = 0;\ S_1 = 0;\ R_1 \mid S_0 = 0\} + \Pr\{S_0 = 1;\ R_1^0\}\,\Pr\{d_1 = 1;\ S_1 = 0;\ R_1 \mid S_0 = 1\}$$

$$= \alpha_0(0)\,\gamma_0(R_1, 0, 0) + \alpha_0(1)\,\gamma_1(R_1, 0, 1).$$

Since the probability of being in state 1 at time zero, $\alpha_0(1)$, is zero,
this leaves only the first term, $\alpha_0(0)\,\gamma_0(R_1, 0, 0)$, to be
considered. Using the fact that the information was sent over a Gaussian
channel, $\gamma_0(R_1, 0, 0)$ is calculated by the following formula:

$$\gamma_i(R_k, m, m') = \text{constant} \cdot \exp[-(x_k - b_s(i, m', m))^2/N_0] \cdot \exp[-(y_k - b_p(i, m', m))^2/N_0] \tag{3.3.1}$$

for each pair of states which allow a transition. I chose to leave the constant
as one and to normalize the $\alpha$ and $\beta$ values after they are
calculated at each time step (this is done by equation 3.2.11 automatically).
$b_s(i, m', m)$ is the systematic bit output at the modulator when there is a
transition from state m’ to state m. Likewise $b_p(i, m', m)$ is the parity bit
output from the modulator when there is a transition from state m’ to state m.
As an example, if it is assumed that the encoder has gone from state 3 to state
1 at time k, then $d_k$ would be 0 and $p_k$ would be 1. Since the modulator
transforms these outputs by the linear transformation given above,
$b_s(i = 0, m' = 3, m = 1) = -1$ while $b_p(i = 0, m' = 3, m = 1) = 1$. $N_0$ is
the noise variance (in this case the noise variance is 1.6). $x_1$ is the
systematic bit received at time k = 1, which is -1.04. $y_1$ is the parity bit
that has been received at time 1, which is -.70. This means the transition
probability $\gamma_0(R_1, 0, 0)$ is

$$\exp[-(-1.04 - (-1))^2/N_0] \cdot \exp[-(-.70 - (-1))^2/N_0] = .99 \cdot .95 = .94$$

Similarly the transition from state 0 to state 2, $\gamma_1(R_1, 2, 0)$, is

$$\exp[-(-1.04 - (+1))^2/N_0] \cdot \exp[-(-.70 - (+1))^2/N_0] = .09 \cdot .19 = .02$$

After the normalization of (3.2.11) these give $\alpha_1(0) = .98$ and
$\alpha_1(2) = .02$, as listed in the table. The rest of the $\alpha_k(m)$ can
be calculated in the same way.

The $\beta_k(m)$ are calculated in a similar way. However, after the final bit has
arrived the final state of the encoder is not known. For this reason $\beta_{10}(m)$ can
either be initialized as $\alpha_{10}(m)$ or given equal weighting as (1/M). I have chosen to
use the former method. $\beta_k(m)$ is then


k =       0     1     2     3     4     5     6     7     8     9     10

state 3 [ .23   .17   .10   .07   .26   .44   .23   .01   .34   .63   .02 ]
state 2 [ .16   .10   .58   .26   .44   .24   .26   .33   .66   .03   .32 ]
state 1 [ .20   .51   .18   .43   .06   .26   .44   .64   .00   .33   .63 ]
state 0 [ .41   .22   .19   .24   .23   .06   .06   .02   .00   .19   .02 ]


The calculation of the $\beta_k(m)$ will now be illustrated using $\beta_9(0)$ as an
example. $\beta_9(0)$ is the probability that the sequence after time 9 (i.e. the last bit)
would arrive given that the state is known to be state zero at that time. For this
case we know that the sequence could only go to state 0 or to state 2. $\beta_9(0)$ is

\[ \beta_{10}(0)\,\gamma_0(R_{10},0,0) + \beta_{10}(2)\,\gamma_1(R_{10},2,0). \]

To calculate $\gamma_0(R_{10},0,0)$ and $\gamma_1(R_{10},2,0)$ we use the same method as before.

\[ \gamma_0(R_{10},0,0) = \exp[-(-2.37 - (-1))^2/N_0] \cdot \exp[-(0 - (-1))^2/N_0] = .33 \times .55 \]
\[ \gamma_1(R_{10},2,0) = \exp[-(-2.37 - 1)^2/N_0] \cdot \exp[-(0 - 1)^2/N_0] = .001 \times .55 \]

So that $\beta_9(0)$ is .02 * .18 + .32 * .00055. At this point you may notice that the
sum of these does not come to .19. This is because the $\gamma$ have not been
normalized. This is why, after all $\beta_9(m)$ have been calculated in the way that was just
described, the values are normalized (this is from 3.2.12). Continuing in this way
for each of the received bits generates all the values of $\beta_k(m)$ for all k and
m, although $\beta_k$ can be discarded after it has been used for generating the output
value at time k if lack of memory is a problem.
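
The backward step just described can be sketched in C as follows. This is only an illustration under stated assumptions: the function name `beta_step`, the table `next` (the state reached from each state for input bit 0 or 1) and the array layout are hypothetical, and the trellis itself must be supplied from the state diagram.

```c
#define NUMSTATES 4

/* One backward step followed by normalization:
 *   beta[m'] = sum over i of beta_next[next[m'][i]] * gamma[m'][i]
 * next[m'][i]  : state reached from m' when the input bit is i
 * gamma[m'][i] : branch metric of that transition, computed from the
 *                values received at the later time */
void beta_step(const double beta_next[NUMSTATES],
               double gamma[NUMSTATES][2],
               int next[NUMSTATES][2],
               double beta[NUMSTATES])
{
    double sum = 0.0;
    int m, i;

    for (m = 0; m < NUMSTATES; m++) {
        beta[m] = 0.0;
        for (i = 0; i < 2; i++)
            beta[m] += beta_next[next[m][i]] * gamma[m][i];
        sum += beta[m];
    }
    /* Normalize so that the values at this time sum to one. */
    if (sum > 0.0)
        for (m = 0; m < NUMSTATES; m++)
            beta[m] /= sum;
}
```

Calling this once with the column for k = 10 as `beta_next` produces the column for k = 9. Repeating it back to k = 0 fills the whole table.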

This information ($\alpha_k$ and $\beta_k$) has been generated to obtain the probability
of each transition, so that the probabilities that each bit was either a 1 or a 0
can be calculated using (3.2.4). Because only the ratio of the
probabilities is needed to generate the log likelihood value, the
probabilities themselves need not be found. As an example I will find the logarithm of the ratio of the
probability that the first bit was a one to the probability that the first bit was a zero.

The only transitions that can occur with the arrival of the first bit are the
transition from state 0 to state 0 (which generates a 0) and the transition from state
0 to state 2 (which generates a 1). Therefore the probability that this output is a
one is given by $\sigma_1(0,2) = \alpha_0(0)\,\gamma_1(R_1,2,0)\,\beta_1(2)$. The probability that the
output is zero is $\sigma_0(0,0) = \alpha_0(0)\,\gamma_0(R_1,0,0)\,\beta_1(0)$. Taking the ratio of these
values and then the logarithm gives a value of -4.67. Since the sign of this value is
negative, the bit has been decoded correctly. The reliability value of 4.67 can give
information about the actual probability of a bit being 1 or 0 if that is desired.
Here is the complete decoded sequence.

[-4.67 -2.2 5.13 -6.15 -3.77 -6.15 1.66 -6.0 -7.1 -7.8]

As can be seen by comparing this with the original sequence, the sequence has been
decoded correctly and the certainty of each bit can be measured relative to the
others.

The disadvantages of this system are now apparent. A very large
amount of memory is needed for decoding (storage of $\alpha$). Also the complexity of
the decoder is apparent from the equations needed to calculate the parameters
(large numbers of multiplies and adds).

The Soft Output Viterbi Algorithm [14] and the Log-MAP algorithm [13]
will now be discussed briefly.

The SOVA is generally similar to the standard Viterbi Algorithm in that it
compares metric values at each node of the trellis to decide which path is the
maximum likelihood path (hence the minimum metric). At each node the SOVA
also compares the path with the minimum valued metric with the path with the
second best metric, and uses that information to update a reliability value for all bits
which are not the same in the two paths. This requires only comparisons of
metrics and table lookups, which are less time consuming than the operations of the MAP
algorithm. Also only one pass through the information is required, as opposed to
the MAP algorithm, which requires a forward and a backward pass.

The Max-Log MAP algorithm is a simplification of the MAP algorithm
that results from taking the log of the probability distribution of the transitions ($\gamma$)
and replacing them by approximations. This algorithm is a better approximation
than the SOVA but not as good as the MAP algorithm.

3.4 Decoding of Turbo-codes 

The general scheme for the decoding is shown in Figure 3.4.1. As soon as
the sequence is received the parity bits are demultiplexed. A soft output decoder
is used with the inputs being the systematic information and the output of the first
RSCC ($d_k$ and $p_{1k}$ after modulation and having noise added, producing $x_k$ and $y_{1k}$
respectively). The output of this decoder is an estimate of the information
sequence and will be called $\Lambda_1$. This estimate is then interleaved according to the
pseudorandom interleaver that was used at the encoding stage. This allows the
new estimate $\Lambda_1$ to be used along with the parity bits from the second recursive
convolutional code in a second soft output decoder. This produces a new estimate
of the (interleaved) information bits. However, because the first decoder did not
use all the information available (specifically it did not use the second set of parity
bits, $y_{2k}$), the performance can be improved by adding a feedback path from the
output of the second soft output decoder to the first decoder, as shown in Figure
3.4.2.

Figure 3.4.1 The General (Suboptimal) Turbo-decoding Structure

One important consideration when feeding information from the second
decoder back to the first is that the information sent back to DEC1 must be
independent of the information generated by DEC1 in the first
place. It should be information that was generated from $y_{2k}$. If the information sent
back to DEC1 had already been generated by DEC1 there would be positive feedback
and the decoding could become unstable. There are two methods for feeding
back information. The first method is from [4] and the second from [5]. The
first method uses slightly different decoding structures for DEC1 and DEC2. The
second method has the same decoding structure for both decoding blocks.

3.5 Method 1 for the Decoding of Turbo-codes 

The first method is achieved by considering the output of the first MAP
decoder (DEC1), which is

\[ \Lambda_1(d_k) = \log \frac{\sum_m \sum_{m'} \gamma_1(R_k, m', m)\,\beta_k(m)\,\alpha_{k-1}(m')}{\sum_m \sum_{m'} \gamma_0(R_k, m', m)\,\beta_k(m)\,\alpha_{k-1}(m')} \tag{3.5.1} \]

Figure 3.4.2 One Optimal Decoding Scheme

In the first decoder (in Fig 3.4.2), the sequence $R_k$ consists of the channel values
$x_k$ and $y_{1k}$. Because the encoder is systematic the transition probability
$p(x_k \mid d_k = i, S_k = m, S_{k-1} = m')$ in $\gamma_i(R_k, m', m)$ (from 3.3.1) is independent of the state
value of the encoder. Being independent of the states means that the summations
over m and m' (the current and previous states) have no effect on it, so it
can be factored out of the numerator and denominator of
(3.5.1). Now

\[ \Lambda_1(d_k) = \log \frac{\sum_m \sum_{m'} \gamma_1(y_{1k}, m', m)\,\beta_k(m)\,\alpha_{k-1}(m')}{\sum_m \sum_{m'} \gamma_0(y_{1k}, m', m)\,\beta_k(m)\,\alpha_{k-1}(m')} + \log \frac{p(x_k \mid d_k = 1)}{p(x_k \mid d_k = 0)} \tag{3.5.2} \]

This can be expressed more concisely as

\[ \Lambda_1(d_k) = W_{1k} + (2/\sigma^2)\,x_k \tag{3.5.3} \]

where $W_{1k}$ is the logarithm of the quotient of the summations in (3.5.2). Notice
that the $\gamma$ term in (3.5.2) depends on $y_{1k}$, not the systematic term $x_k$. So $W_{1k} =
\{\Lambda_1(d_k) \mid x_k = 0\}$. The $\alpha$ and $\beta$ terms are still built with systematic terms as well
as the parity information. The $x_k$ is multiplied by $(2/\sigma^2)$ in (3.5.3) because $x_k$ is
Gaussian with mean $\pm 1$ and variance $\sigma^2$. This shows that $W_{1k}$ is the
information produced using the structure of RSCC1 (it is the information output
from DEC1 that depends on memory).

Now $\Lambda_1(d_k)$ will act as the systematic information in the input to the second
decoder. The output of the second decoder will be

\[ \Lambda_2(d_k) = W_{2k} + f(\Lambda_1(d_k)) \tag{3.5.4} \]

with $W_{2k}$ defined similarly to $W_{1k}$ in (3.5.3). $f(\cdot)$ is some function of $\Lambda_1(d_k)$.
$W_{2k}$ is a function of the sequence $y_{2k}$ and uses a priori information from the
sequence $\{\Lambda_1(d_k)\}$. Because of interleaving between decoders $W_{2k}$ is only weakly
correlated with $x_k$ and $y_{1k}$ (the hope is that it is independent of $\{\Lambda_1(d_k)\}$). This
means that a new decoding process can take place with $x_k$, $y_{1k}$ and using $W_{2k}$ as a
priori information in DEC1 after the first decoding iteration has occurred. [4] sets
$z_k = W_{2k}$ and assumes that it can be approximated by a Gaussian random variable
with a variance of $\sigma_z^2$ (the variance $\sigma_z^2$ must be estimated at every iteration).
After the first iteration the output of DEC1 will be determined by $x_k$, $y_{1k}$, and $z_k$
and will be equal to

\[ \Lambda_1(d_k) = W_{1k} + (2/\sigma^2)\,x_k + (2/\sigma_z^2)\,z_k \tag{3.5.5} \]

In (3.5.5) the $W_{1k}$ term has used $x_k$, $y_{1k}$, and $z_k$ to build $\alpha$ and $\beta$ (as the a priori
information). Now since $z_k$ has been built by DEC2 it cannot be reused as input
information for DEC2. This means that $(2/\sigma_z^2)\,z_k$ must be subtracted off after
decoding has been done. The decoder structure is shown in Figure 3.4.2.


3.6 Method 2 for the Decoding of Turbo-codes 


The first method of decoding Turbo-codes involved passing to the second
decoder information that was obtained from both the systematic
sequence and the first parity sequence. The second method involves sending the
systematic sequence directly (after interleaving) and also using the a priori
information directly. The output of either of the MAP decoders in this method is
split into three parts in a manner similar to (3.5.5). The result is

\[ \Lambda(d_k) = \log \frac{\sum_m \sum_{m'} \gamma_1(y_k, m', m)\,\beta_k(m)\,\alpha_{k-1}(m')}{\sum_m \sum_{m'} \gamma_0(y_k, m', m)\,\beta_k(m)\,\alpha_{k-1}(m')} + \log \frac{p(x_k \mid d_k = 1)}{p(x_k \mid d_k = 0)} + L(d_k) \tag{3.6.1} \]

$L(d_k)$ is set to zero for the first iteration of the first decoder. After that $L(d_k)$ is the
a priori information generated by the previous decoder (i.e. the log of the
summation of products). This means $L(d_k)$ is generated by the parity information
from the previous decoder. The systematic information is interleaved (or
deinterleaved) and passed to the next decoder separately. The use of $L(d_k)$ in
decoding comes in considering the value of

\[ \gamma_i(x_k, y_k, L(d_k), m', m) = \Pr(x_k \mid d_k = i, S_k = m, S_{k-1} = m') \cdot \Pr(y_k \mid d_k = i, S_k = m, S_{k-1} = m') \cdot \Pr(d_k = i \mid S_k = m, S_{k-1} = m') \cdot \Pr\{S_k = m \mid S_{k-1} = m', L(d_k)\} \tag{3.6.2} \]

$\Pr(d_k = i \mid S_k = m, S_{k-1} = m')$ is either zero or one depending on whether there is an
output i associated with a state transition from m' to m. The information from the
previous decoder is used through $\Pr\{S_k = m \mid S_{k-1} = m', L(d_k)\}$. $L(d_k)$
was generated as the log of the summation of products from the previous decoder.
This means that $L(d_k)$ is equal to $\log(\Pr(d_k = 1)/\Pr(d_k = 0))$ using information
generated by the previous decoder. By exponentiating $L(d_k)$ and using the fact
that $\Pr(d_k = 1) + \Pr(d_k = 0) = 1$, its value can be given as follows:

\[ \Pr\{S_k = m \mid S_{k-1} = m', L(d_k)\} = \frac{e^{L(d_k)}}{1 + e^{L(d_k)}} \tag{3.6.3} \]

if the state transition from m' to m determines a 1; and

\[ \Pr\{S_k = m \mid S_{k-1} = m', L(d_k)\} = \frac{1}{1 + e^{L(d_k)}} \tag{3.6.4} \]

if the state transition from m' to m determines a 0. In plain language, this
gives the probability that a bit is one or zero according to the information
generated by the previous decoder. The decoding scheme used in this case is
shown in Figure 3.6.1. The advantage of this method is that no variance estimate
is required. For this reason I used this decoding method in my decoder.


With either method the number of iterations can be determined by
knowing the number of iterations needed to achieve the required BER.



Figure 3.6.1 The Second Optimal Decoding Method 












Chapter 4 
Results 


In Turbo-coding there are several components (i.e. random interleavers, 
RSCC’s, and decoders), each with different parameters. Even separately these 
components can be difficult to analyze. Several papers have helped in the 
separate analysis of both the interleavers and the RSCC’s [8], [11]. One of the
important results claimed in [11] is that the interleaver size is the most important 
factor in determining the performance of Turbo-codes and that BER performance 
is inversely proportional to the size of the interleaver for large enough, random 
enough interleavers. This is important because it allows for testing of other 
components somewhat independently of the interleaver. For this reason only one 
interleaver was tested. The implementation of the interleaver is given in section 
4.1. 

The MAP decoding algorithm is used in the simulation. The reason for 
this is that this will give the best possible performance. Also simpler, more 
memory efficient versions of the MAP algorithm are becoming available [15]. 
The second decoding method, described in 3.6, was used because the variance at 
the output of the second decoder did not need to be estimated. 




In the simulations there were no zero bits tacked on to the end of each 
block and the final state of the encoder was unknown. This did not seem to 
degrade performance. Most of the decoding was done for a maximum of 18 
iterations. This is because it was done this way in [4] and is considered a 
benchmark for my research. 

The first requirement was to test the memory 4 generators to determine 
which produce the best BER curves. Memory 4 codes are generally used because 
they can generate very good performance. Higher memory generators do not 
generally add much performance gain and the decoding process is much more 
complex (remember that decoding complexity and memory requirements increase 
by more than a factor of 2 for every memory element added). Section 4.2 will 
give the simulation results of memory 4 RSCC’s concatenated in a Turbo-code 
scheme. 

The next consideration is the reduction of decoder complexity while
maintaining good performance levels by reducing the memory for RSCC1 and
using the standard memory 4 RSCC2. Because the decoding complexity of
each (MAP) decoder grows exponentially with encoder memory, the complexity
of a Turbo-code with memory 4 RSCC2 and memory 3 RSCC1 is approximately
75% of the complexity of the standard scheme with two memory 4 RSCC’s (ignoring the
interleaving and deinterleaving operations,
which in any case are just reading and writing operations). For memory 4 RSCC2
and memory 2 RSCC1 the complexity is about 5/8 of the standard. This analysis
assumes the same number of iterations for both decoding structures being
compared. If the performance is not degraded significantly then the savings in
decoding complexity can be a significant factor. Section 4.3 will give the
simulation results of the concatenation of 2 different RSCC’s, one with a smaller
memory.
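
The two ratios quoted above follow from a simple count, sketched here in C. The sketch assumes that the cost of each MAP decoder is proportional to its number of states, $2^{mem}$, and that the reference is a scheme with two memory 4 encoders; the function name is hypothetical.

```c
/* Relative MAP decoding complexity of a Turbo-code whose constituent
 * encoders have memories mem1 and mem2, compared with a reference scheme
 * that uses two memory 4 encoders. Each decoder is assumed to cost an
 * amount proportional to its number of states, 2^mem. */
double relative_complexity(int mem1, int mem2)
{
    double states1 = (double)(1 << mem1);
    double states2 = (double)(1 << mem2);
    double reference = 2.0 * (double)(1 << 4);

    return (states1 + states2) / reference;
}
```

Under these assumptions `relative_complexity(3, 4)` is (8 + 16)/32 = 0.75 and `relative_complexity(2, 4)` is (4 + 16)/32 = 0.625, that is, 5/8.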

The next idea that was considered was observing the effect of reducing the 
rate of the Turbo-codes by sending all parity bits and rejecting puncturing. Using 
lower rate codes can result in power savings at the expense of extra bandwidth. In 
cases when power is limited it is important to know how well Turbo-codes can 
perform without puncturing. Section 4.4 will give the simulation results of a rate 
1/3 Turbo-code. 

Section 4.5 will give the simulation results of a Turbo-code where noise 
variance was measured inaccurately. This is done because the MAP algorithm 
requires an estimate of the noise variance. If Turbo-codes were to decode poorly 
because of a small error in the noise variance estimate then they would be of 
almost no practical use. These simulation results will show how much 
performance is degraded by some poor estimates. 

Of course this research has not closed the book on Turbo-codes. Section 
4.6 will give ideas for further research. 


4.1 Interleaver Implementation 

The interleaver algorithm used in this simulation is implemented as
follows [10]: for an M*M memory (where M is 256, hence there are 65536
bits/block) the bits to be interleaved are read into a square matrix. If i and j are
the addresses of the line and column for writing, respectively (with the first line
and column being labeled line 0 and column 0 respectively) and $i_r$ and $j_r$ are the
line and column for reading respectively, then the rule for reading is

    i_r = (M/2 + 1)(i + j) mod M
    E   = (i + j) mod 8
    j_r = [P(E) * (j + 1)] - 1 mod M

where P(E) is a number relatively prime with M that depends on
the line address through E = (i + j) mod 8.

P(E) is given as follows:

P(0) = 17; P(1) = 37; P(2) = 19; P(3) = 29;

P(4) = 41; P(5) = 23; P(6) = 13; P(7) = 7;
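
The reading rule can be sketched in C as follows. This is an illustration only and is not the `interfloat` routine of the program listing: the function name `read_address` is hypothetical, and the expression for $j_r$ is read here as (P(E)*(j + 1) - 1) mod M, which is an assumption about the intended grouping.

```c
#define M 256  /* side of the square interleaver; M*M = 65536 bits/block */

static const int P[8] = {17, 37, 19, 29, 41, 23, 13, 7};

/* For the write address (i, j), compute the read address (*ir, *jr). */
void read_address(int i, int j, int *ir, int *jr)
{
    int e = (i + j) % 8;                  /* selects which P(E) is used */

    *ir = ((M / 2 + 1) * (i + j)) % M;
    *jr = (P[e] * (j + 1) - 1) % M;
}
```

Because M/2 + 1 and every P(E) are odd, and therefore invertible modulo M when M is a power of 2, distinct write addresses map to distinct read addresses, so the rule defines a permutation of the M*M block.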

4.2 Memory 4 Generators 

Five different generators have been considered. The first is a 27_31 encoder,
which is shown in Figure 4.2.1 with results shown in Figure 4.2.2. The 27_31
circuit had the best BER curves after 18 iterations. For this reason iterations were
continued beyond 18 to determine how well it would perform. This code decoded
below a BER of $10^{-5}$ at .65 dB after 28 iterations. Although the number of iterations is
very large, it may be worth it if power is a constraint in a given application. The
BER of the 27_31 code after 18 iterations was used as the reference against which the other
Turbo-encoders were compared. The dashed line in the BER curves is the result of the
27_31 code after 18 iterations.

Next a 23_35 encoder was tested [8]. This encoder is shown in Figure 4.2.3 with
results shown in Figure 4.2.4. This RSCC has the best distance properties. It can
be seen that its BER curves are not as good as those of the 27_31 code after 18 iterations.
However the BER after 1 and 2 iterations is better than that of the 27_31 code. What this
seems to show is that this encoder may perform better asymptotically at higher
$E_b/N_0$.

The next two generators were given in [9]. The method given there requires the FB (feedback) portion
of the encoding circuit to be a primitive polynomial, while the FF (feedforward) portion of the
circuit is chosen to minimize BER using certain criteria. The two generator
polynomials given in that paper were the 31_27 and 31_33 generators. Of these
two, only the results of the 31_27 encoder, which is shown in Figure 4.2.5 with
results shown in Figure 4.2.6, are given. This is because the generators were
obtained by the same method and the results are similar. The BER curves of these
circuits are very similar to the BER curves obtained by the 23_35 circuit. Both
are approximately .1 dB away from the 27_31 circuit after 18 iterations at BER
$10^{-5}$ and both of them have steeper dropoffs at higher $E_b/N_0$.

Finally the original circuit used in [4], which was a 37_21 circuit, shown in Figure
4.2.7, was tested. Results are shown in Figure 4.2.8. This circuit performed
better than any other circuit at low $E_b/N_0$ after many iterations, with the exception of the
27_31 encoder.

The 37_21 RSCC and the 27_31 RSCC were chosen arbitrarily while the
other RSCC’s were chosen based on analytical techniques. The 23_35 RSCC was
determined based on distance properties and not on how it would perform in a
Turbo-code scheme. It was not necessarily expected to perform well as a Turbo-code.
However the 31_33 RSCC and the 31_27 RSCC were designed to be
optimal in a Turbo-code scheme. What this analysis has shown is that the
RSCC’s that are selected based on the analytical techniques may not perform the
best at very low $E_b/N_0$. From the results of the simulations completed here it
appears that the best memory 4 encoder obtained so far is the 27_31 RSCC, but
this does not mean that better RSCC’s will not be found. Better analytical
methods need to be found for generating good RSCC’s to remove any doubt as to
which RSCC will perform best.



Figure 4.2.2 Performance for 27_31 Code Turbo-code Scheme

Figure 4.2.3 23_35 Generator Circuit

Figure 4.2.4 Performance for 23_35 Code Turbo-code Scheme

4.3 Lowering Decoder Complexity

The next consideration is the reduction of decoder complexity while
maintaining good performance levels by reducing the memory for RSCC1 and
using the standard memory 4 RSCC2. It was shown in [11] that RSCC1
should be the encoder with reduced memory.

The smaller memory generators that were used were obtained from [8].
The 7_5 circuit is shown in Figure 4.3.1. The results of the 7_5 RSCC1
concatenated with the 27_31 RSCC2 are shown in Figure 4.3.2. A closeup of
these results is shown in Figure 4.3.3 to highlight the differences between the
curves. The 15_17 circuit is shown in Figure 4.3.4. The results of the 15_17
RSCC1 concatenated with the 27_31 RSCC2 are shown in Figure 4.3.5. A
closeup of these results is shown in Figure 4.3.6.

As can be seen in the Figures, the loss in coding gain is small.
For decoding at a BER of $10^{-5}$ the loss in power is only .12 dB and .10 dB for memory 2
and memory 3 RSCC1 respectively concatenated with the memory 4 RSCC2. At $10^{-4}$ the
difference was even less pronounced, with losses of only .07 and .04 dB. In many
cases it seems this would be a fair tradeoff given the reduced decoding
complexity. If decoding complexity is a problem the smaller memory should be
used, since the difference in power savings between the two is not significant.

Figure 4.3.1 A 7_5 Generator Circuit

Figure 4.3.2 BER Curve for a Concatenated 7_5 Generating Circuit

Figure 4.3.5 BER Curve for a Concatenated 15_17 Generating Circuit

Figure 4.3.6 Closeup BER Curve for a Concatenated 15_17 Generating Circuit

4.4 Lower Rate Turbo-codes

It was suggested in [11] that unpunctured Turbo-codes might not perform
as well as punctured Turbo-codes. To determine the validity of this claim,
simulations were done on an overall rate 1/3 Turbo-code with results shown in
Figure 4.4. Since the Shannon limit at rate 1/3 is -.55 dB the results are very good.
The code decodes at only .65 dB away from the Shannon limit in only 14 iterations.
This is the same distance away from the Shannon limit as the punctured codes
after 28 iterations. The tradeoff is increased bandwidth requirements, which may
not be a problem in some applications.



Figure 4.4 Performance of 27_31 RSCC’s Without Puncturing

4.5 Inaccurate Noise Variance Measurement

Finally the effect of inaccurate noise variance measurement on the decoder 
was observed. The effect of underestimating the variance is given in Figure 4.5.1 
with the results of an overestimate of the variance given in Figure 4.5.2. From 
these Figures it can be seen that an error of 20% either way in the estimate of the 
variance will result in approximately a .1 dB loss. Of course the worse the 
estimate is, the worse the decoding performance will be. This seems to be a 
reasonable amount of loss. This shows that the MAP algorithm is not terribly 
unstable for inaccurate noise variance measurements. 



Figure 4.5.1 Underestimating Variance

4.6 Further Research

Some of the questions about Turbo-codes that are still unanswered at this
time will now be presented, some of which were posed in [11].

It has been found that the MAP algorithm used with Turbo-codes
approaches the analytical bounds given in [11] after many iterations. One question is
whether suboptimal decoding algorithms, such as the log-MAP algorithm and the
Soft Output Viterbi Algorithm (SOVA), will also converge to the same levels. Also
the complexity of these algorithms versus the optimal MAP algorithm needs to be
analysed. Perhaps two of these algorithms could be used for decoding, with a
less complicated decoder used for the first few iterations and the MAP
algorithm used as a “clean up” type of decoder that eliminates the residual error.

While it has been shown that it is not hard to obtain a good large size 
interleaver it remains to be seen whether an analytical device can be found that 
will give an optimal interleaver for a given interleaver size. Also the analysis of 
the optimal interleaver for a small interleaver still has not been completely solved. 

Multi-dimensional Turbo-codes have also been investigated. Multi-dimensional
Turbo-codes are codes that are encoded by sending the systematic
information and also passing the information through multiple interleavers to be
encoded by multiple RSCC’s.

The combined modulation and coding technique, Trellis Coded
Modulation (TCM), provides good coding gain as well as bandwidth efficiency.
Combining the ideas of Turbo-codes and TCM was begun in [16].

Appendix A

Properties of Nonrandom Block Interleavers 

Some analysis of the distance properties of nonrandom block interleaved
sequences will now be given. This will show that some low weight input
sequences (i.e. input weight 2 or 3) will produce output words that have a high
output weight and whose output weight increases for larger interleavers. This is a
good result because the goal of encoding Turbo-codes through an interleaver is to
boost the output weight for sequences that would produce a low weight codeword
through a single RSCC. However the analysis will also show that, for input weight 4, nonrandom
block interleavers produce too many low output weight codewords that are not
affected by interleaver size. This will show that nonrandom
block interleavers do not adequately “randomize” the output from RSCC2. This
analysis will follow Berrou closely [10].

Consider the Turbo-encoder shown in Figure 1.3. To simplify analysis and
to give some concrete numbers to observe, the RSCC generator will be a 23_35
(octal) punctured encoder, which is shown in Figure 4.2.3. Those sequences that
produce finite weight outputs of both RSCC’s and have a finite weight input
sequence are called global finite codewords or FC patterns. Some FC patterns
with low output weight will be shown.

Consider a large M*M nonrandom block interleaving matrix (assuming M
is a power of 2). Information bits are read in through the rows and read out
through the columns. By assuming the matrix is filled with only a small number
of ones and the rest of it is filled with zeros the analysis is greatly simplified.
Because of the recursive nature of the codes, at least 2 information bits equal to one
are necessary for a FC to be produced. The RSCC’s repeat every $2^m - 1$ bits for a
memory m code. With d representing a systematic sequence with weight w and
$p_1$ and $p_2$ representing the parity information generated by RSCC1 and RSCC2
respectively, the distance of the FC from the zero codeword is given as

\[ \mathrm{distance}(w) = w + \mathrm{distance}_{p_1}(w) + \mathrm{distance}_{p_2}(w) \tag{A.1} \]

w is the weight of the systematic portion of the output and $\mathrm{distance}_{p_1}(w)$ is
the weight of the punctured output of RSCC1 with the input being $d_k$.
$\mathrm{distance}_{p_2}(w)$ is the weight of the punctured output of RSCC2 with the input being
the interleaved version of $d_k$. Puncturing is done by transmitting $p_{1k}$ only at odd
times k (k = 1, 3, 5, ...) and $p_{2k}$ at even times k.

For an input weight of 2, distance(2) can be given if $\mathrm{distance}_{p_1}(w)$ is
assumed to be generated by the minimum distance between bits that will produce
a FC (for a memory 4 encoder the distance between two ones that will cause a finite
output weight is 15, because a RSCC repeats itself after $2^m - 1$ bits, meaning that
$\mathrm{distance}_{p_1}(w)$ is 4). Now

\[ \mathrm{distance}(2) = 2 + 4 + \mathrm{INT}((15 \cdot M + 1)/4) \tag{A.2} \]

The final term is generated by assuming that the (15*M + 1)/2 symbols output
from RSCC2 are 1 half of the time. For the size of interleaver used in the
simulation (M = 256), distance(2) would be about 966. This shows that M is the
main factor in the output weight for large interleavers with weight 2 input
sequences. In fact it has been shown [11] that increasing the size of the
interleaver in a Turbo-code scheme by a factor of N will reduce the BER by a
factor of N. This means that if an interleaver size of 100 bits in a Turbo-code
scheme generates a BER of $10^{-3}$ at a given $E_b/N_0$ then an interleaver size of 1000
should generate a BER of $10^{-4}$.
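
The figure of 966 can be reproduced with a one-line calculation, sketched here in C. The function name `distance2` and the parameter name `side` (standing for M) are hypothetical, and integer division is assumed to play the role of INT() in (A.2).

```c
/* Approximate output weight of (A.2) for a weight 2 input and a
 * side*side block interleaver with memory 4 RSCC's. Integer division
 * truncates, which is assumed to match INT() in (A.2). */
int distance2(int side)
{
    return 2 + 4 + (15 * side + 1) / 4;
}
```

For `side` equal to 256 this gives 2 + 4 + 960 = 966, and the result grows linearly with the side of the interleaver.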

For an input $d_k$ with weight 3 some of the patterns that can cause a FC can
be seen by tracing the output on the state diagram for three inputs that are ones,
but they are not easy to catalogue. It might be assumed once again that the
distances are similar to the case of two 1’s because the finite codeword output from
RSCC2 will still be several times M long. This means that weight 3 input
sequences will produce output weights that will increase with larger interleavers
and therefore give better performance.

For higher input weight sequences the analysis comes down to viewing the input
as the separate combination of several lower weight codewords. For example an
input of weight 4 can be viewed as an input of two weight 2 codewords. The
minimum output weight for input weight 4 occurs when the global FC is
a square input pattern with 1’s on the corners (Fig. A.1). The
minimum output weight for this is given by

\[ d(4) = 4 + 2 \cdot \min\{\mathrm{distance}_{p_1}(w)\} + 2 \cdot \min\{\mathrm{distance}_{p_2}(w)\} = 4 + 8 + 8 = 20 \tag{A.3} \]

Also notice that any rectangular input pattern with weight 4, and with distances
between ones that are a multiple of 15, will cause a FC. What this example shows
is that with a block interleaver the output weight of both RSCC codes may be
small. The desire is to map most codewords into medium weight codewords.

10000000000000001
00000000000000000
00000000000000000
00000000000000000

00000000000000000
10000000000000001

Figure A.1 An Input Pattern That Will Cause a Global FC

It is hoped that interleavers
that are more random could stand a better chance of mapping those low weight
output sequences from RSCC1 into higher weight output sequences of RSCC2.
What is desired when data is interleaved is the maximum scattering of data and
also the maximum amount of disorder in the interleaved data.

Some of the difficulties in determining good random interleavers are 
these: How can it be determined that an interleaver that does a good job of breaking 
up, say, w = 4 inputs like the one in Figure 2.3.2 will not create more codewords 
with low weights for w = 2? Also, the complexity must be limited because 
data must be interleaved and deinterleaved many times in a decoding operation. 

For higher weight inputs, analysis becomes more difficult because 
the inputs can be viewed as combinations of other patterns of codewords. 
However, it seems that as long as the interleaver used does not have too much 
structure (as a block interleaver does), it should work well enough in a Turbo-code 
scheme. 



Appendix B 

Flowcharts For Simulation Program 

Appendix C 

Program Listing 


/* Simulation Program */ 

/* This runs an entire simulation of a turbo coding scheme. It calls 
functions decout(out, trans, numstates, sysreal, sysfb, parity, N). N is noise, 
numbits is the number of bits decoded per block, which should be defined in 
here. In mmt.h we have all the memory allocation tricks and gasdev() which 
is a gaussian random number generator. The interleavers 
which are of the form interfloat(*data, M) where M is the root of the 
size of the interleaver (square root of numbits) */ 

/* To run change filename to dump output to, generator polynomial, EbNo 
memory and number of state */ 

#include <stdio.h> 
#include <stdlib.h> 
#include <math.h> 
#include "mmt.h" 
#include "header.h" 

void main (void) 
{ 
    FILE *ini; 
    int **g1, **g2, i, j, k, prevstate, numbits = 16384, numblocks = 150; 
    int *state, in = 0, mem1 = 4, mem2 = 4, numstates1 = 16, numstates2 = 16; 
    /* numstates has to be size 2^mem */ 
    int **out1, **trans1, **out2, **trans2; /* These give output and 
    transition information about the encoders */ 
    int *d, numerr = 0, stat, M = 128, numits = 18, file_num_errs[28] = {0} /* must be 
    size numits + 10 */, **intoint, **outofint; 
    float N, std, *dcorrupt, *p1, *p2, *intrinsic, EbNo = .8, rate = .5, max = 2; 
    /* rate is .5 because of puncturing. EbNo is in dB */ 
    float *sysfb, **alfa1, **beta1; 
    float **intofloat, **outoffloat; 
    double x; 
    long idum[1] = {0}; 

    /* i,j, ir,jr are indexes that stand for inputs to the interleaving 
    matrix and reading from interleaving matrix. */ 

    ini = fopen("3127.txt", "a+t"); /* this is the name of the file it 
    will be stored in */ 

    /* g1 and g2 are generator matrices that help create out1, out2, 
    trans1, trans2 with prevstate, *state */ 

    /* mem1 and mem2 are the memory for g1, g2. numstates1 = 2^mem1 
    numstates2 = 2^mem2 */ 

    /* d is the information bits which create dcorrupt, p1 (parity bits 
    from the first generator), p2 (likewise for the interleaved info) */ 
    /* EbNo is given for rate 1/2. numbits is M*M */ 

    /* The other variables are used in generating the information */ 

    /* time to allocate memory for **all** the variables, from mmt.h */ 
    out1 = int_matrix_2d(numstates1, 2); 
    trans1 = int_matrix_2d(numstates1, 2); 
    out2 = int_matrix_2d(numstates2, 2); 
    trans2 = int_matrix_2d(numstates2, 2); 
    state = (int *) calloc(1, sizeof(int)); 
    d = (int *) calloc(numbits, sizeof(int)); 
    dcorrupt = (float *) calloc(numbits, sizeof(float)); 
    p1 = (float *) calloc(numbits, sizeof(float)); 
    p2 = (float *) calloc(numbits, sizeof(float)); 
    intrinsic = (float *) calloc(numbits, sizeof(float)); 
    sysfb = (float *) calloc(numbits, sizeof(float)); 
    alfa1 = float_matrix_2d(numstates2, numbits+1); 
    beta1 = float_matrix_2d(numstates2, numbits+1); 
    intofloat = float_matrix_2d(M, M); 
    outoffloat = float_matrix_2d(M, M); 
    intoint = int_matrix_2d(M, M); 
    outofint = int_matrix_2d(M, M); /* allocating memory */ 

    /* converts EbNo to a noise variance */ 
    N = (2) / ((float) ((2.0 * rate * (float) (pow(10, EbNo/10))))); 
    std = sqrt(N/2); 

    /* printf("EbNo = %f variance = %f \n", EbNo, N/2); */ 
    g1 = int_matrix_2d(2, mem1+1); /* allocating mem for g1 */ 
    g1[0][0]=1; g1[0][1]=1; g1[0][2]=0; g1[0][3]=0; g1[0][4]=1; 
    g1[1][0]=1; g1[1][1]=0; g1[1][2]=1; g1[1][3]=1; g1[1][4]=1; 
    g2 = int_matrix_2d(2, mem2+1); /* allocating mem for g2 */ 
    g2[0][0]=1; g2[0][1]=1; g2[0][2]=0; g2[0][3]=0; g2[0][4]=1; 
    g2[1][0]=1; g2[1][1]=0; g2[1][2]=1; g2[1][3]=1; g2[1][4]=1; 

    /* create output and transition matrices */ 
    for (in=0; in<=1; in++) { 
        for (prevstate=0; prevstate<=numstates1-1; prevstate++) { 
            state[0] = prevstate; 
            out1[prevstate][in] = encode(g1, in, state, mem1); 
            trans1[prevstate][in] = state[0]; 
        } 
    } 

    for (in=0; in<=1; in++) { 
        for (prevstate=0; prevstate<numstates2; prevstate++) { 
            *state = prevstate; 
            out2[prevstate][in] = encode(g2, in, state, mem2); 
            trans2[prevstate][in] = *state; 
        } 
    } 

    /********************* START SIMULATION *********************/ 
    for (k=0; k<numblocks; k++) { 
        for (i=0; i<numbits; i++) { /* making info bits */ 
            d[i] = (int) (uniform() + .5); 
            dcorrupt[i] = 2 * ((float)(d[i])) - 1 + std*gasdev(idum); 
        } 

        for (i=0; i<numbits; i++) { sysfb[i] = 0; } 
        stat = 0; 
        for (i=0; i<numbits; i++) { 
            /* making p1 bits */ 
            p1[i] = ((float) (out1[stat][d[i]]))*2 - 1 + std*gasdev(idum); 
            stat = trans1[stat][d[i]]; 
        } 

        for (i=0; i<=numbits-1; i++) if (i%2 != 0) { p1[i] = 0.0; } /* puncturing p1 */ 

        interint(d, M, intoint, outofint); /* interleave to make p2 bits */ 
        stat = 0; 
        for (i=0; i<=numbits-1; i++) { 
            p2[i] = ((float) (out2[stat][d[i]]))*2 - 1 + std*gasdev(idum); 
            stat = trans2[stat][d[i]]; 
        } 

        for (i=0; i<=numbits-1; i++) if (i%2 != 1) { p2[i] = 0.0; } /* puncturing p2 */ 

        deinterint(d, M, intoint, outofint); 

        for (i=0; i<=numbits-1; i++) { /* truncate to prevent overflow */ 
            if (dcorrupt[i] > max) { dcorrupt[i] = max; } 
            if (dcorrupt[i] < -max) { dcorrupt[i] = -max; } 
            if (p1[i] > max) { p1[i] = max; } 
            if (p1[i] < -max) { p1[i] = -max; } 
            if (p2[i] > max) { p2[i] = max; } 
            if (p2[i] < -max) { p2[i] = -max; } 
        } 

        numerr = checkerr(d, dcorrupt, numbits); /* see how many errors there 
        are originally */ 
        file_num_errs[0] += numerr; 
        printf(" number of errors for %d bits after 0 iterations is %d \n", numbits, numerr); 
        printf(" error percentage = %f \n", ((float)numerr)/((float)numbits)); 

        for (i=1; i<=numits; i++) { /* turbo decoding process using process from 
            Robertson's paper */ 
            decout1(numbits, out1, trans1, numstates1, dcorrupt, sysfb, p1, N, alfa1, beta1); 
            /* first decoder uses p1 to build info. Output of decoder is in sysfb */ 
            for (j=0; j<numbits; j++) { intrinsic[j] = sysfb[j]; } /* stores 
            output of first decoder for errorchecking purposes */ 

            interfloat(sysfb, M, intofloat, outoffloat); /* interleave inputs to 
            dec 2 */ 
            interfloat(dcorrupt, M, intofloat, outoffloat); 
            decout1(numbits, out2, trans2, numstates2, dcorrupt, sysfb, p2, N, alfa1, beta1); 
            /* output of dec2 is built by p2. Again output of this decoder 
            is in sysfb */ 

            deinterfloat(sysfb, M, intofloat, outoffloat); 
            deinterfloat(dcorrupt, M, intofloat, outoffloat); 

            for (j=0; j<numbits; j++) { intrinsic[j] = intrinsic[j] + sysfb[j] 
                + (2/(N)) * dcorrupt[j]; } 

            numerr = checkerr(d, intrinsic, numbits); /* checking number of errors */ 

            if (numerr == 0) { i = numits+1; } 
            file_num_errs[i] += numerr; 
            printf("number of errors for %d bits after %d iterations is %d \n", numbits, i, numerr); 
            printf(" error percentage = %f \n", ((float)numerr)/((float)numbits)); 
        } 
    } 

    /* print results to a file */ 
    fprintf(ini, " g1 is "); 
    fprintf(ini, " \n"); 
    fprintf(ini, "%d %d %d %d %d \n%d %d %d %d %d", 
        g1[0][0], g1[0][1], g1[0][2], g1[0][3], g1[0][4], 
        g1[1][0], g1[1][1], g1[1][2], g1[1][3], g1[1][4]); 
    fprintf(ini, " \n\n"); 
    fprintf(ini, " g2 is \n "); 
    fprintf(ini, " %d %d %d %d %d \n%d %d %d %d %d", 
        g2[0][0], g2[0][1], g2[0][2], g2[0][3], g2[0][4], 
        g2[1][0], g2[1][1], g2[1][2], g2[1][3], g2[1][4]); 
    fprintf(ini, " \n \n "); 
    fprintf(ini, "Eb/No is %f \n", EbNo); 
    fprintf(ini, "number of blocks is %d \n", numblocks); 
    fprintf(ini, "number of bits/block is %d \n", numbits); 
    for (i=0; i<=numits; i++) { 
        fprintf(ini, "number of errors for %d iterations is %d BER = %f \n", i, file_num_errs[i], 
            ((float)(file_num_errs[i]))/((float)(numblocks*numbits))); 
    } 

    fclose(ini); 
} 

/* MAP decoding function. */ 
/* function returns the estimate in sysfb */ 

void decout1(int numbits, int **out, int **trans, int numstates, float 
*sysreal, float *sysfb, float *parity, float N, float **alfa, float **beta){ 
    float bsysreal[2], bpar[2], bfb[2], temp1, temp2, mz=0, probzero, probone; 
    int i, j, k, l, m; /* indexes */ 

    /* alfa and beta follow Bahl et al. '73 */ 
    /* bpar, bfb, and bsysreal are components of gamma, it is done this way to 
    save processing time */ 
    /* probone and probzero are temporary variables to get log likelihood 
    value */ 

    for (i=0; i<numstates; i++) { /* initialize alfa, beta */ 
        for (j=0; j<numbits; j++) { 
            alfa[i][j] = 0; 
            beta[i][j] = 0; 
        } 
    } 

    alfa[0][0] = 1.0; /* initialising alfa */ 

    /* computes all alfa's */ 
    for (i=0; i<numbits; i++) { 
        bsysreal[0] = exp(-((sysreal[i] + 1)*(sysreal[i] + 1))/N); /* components for gamma */ 
        bsysreal[1] = exp(-((sysreal[i] - 1)*(sysreal[i] - 1))/N); 
        bpar[0] = exp(-((parity[i] + 1)*(parity[i] + 1))/N); 
        bpar[1] = exp(-((parity[i] - 1)*(parity[i] - 1))/N); 
        bfb[1] = (exp(sysfb[i]))/(1+exp(sysfb[i])); 
        bfb[0] = 1-bfb[1]; 
        temp1 = 0; 
        temp2 = 0; 

        for (m=0; m<numstates; m++) { 
            for (l=0; l<=1; l++) { 
                temp1 = alfa[m][i]*bsysreal[l]*bfb[l]*bpar[out[m][l]]; 
                alfa[trans[m][l]][i+1] += temp1; 
                temp2 += temp1; 
            } 
        } /* calculates alfa for the next i */ 

        for (m=0; m<numstates; m++) { 
            alfa[m][i+1] = alfa[m][i+1]/temp2; /* normalize */ 
        } 
    } /* alfa is done */ 

    /* initialize beta at the last time */ 
    for (i=0; i<numstates; i++) { 
        beta[i][numbits] = 1.0/((float)(numstates)); 
    } 

    for (i=numbits; i>0; i--) { /* recursively calculate beta */ 
        bsysreal[0] = exp(-((sysreal[i-1] + 1)*(sysreal[i-1] + 1))/N); /* components for gamma */ 
        bsysreal[1] = exp(-((sysreal[i-1] - 1)*(sysreal[i-1] - 1))/N); 
        bpar[0] = exp(-((parity[i-1] + 1)*(parity[i-1] + 1))/N); 
        bpar[1] = exp(-((parity[i-1] - 1)*(parity[i-1] - 1))/N); 
        bfb[1] = (exp(sysfb[i-1]))/(1+exp(sysfb[i-1])); 
        bfb[0] = 1-bfb[1]; 
        temp1 = 0; 
        temp2 = 0; 

        for (m=0; m<numstates; m++) { 
            for (l=0; l<=1; l++) { 
                temp1 = beta[trans[m][l]][i]*bsysreal[l]*bfb[l]*bpar[out[m][l]]; 
                beta[m][i-1] += temp1; 
                temp2 += temp1; 
            } 
        } /* calculates beta for the next i */ 

        for (m=0; m<numstates; m++) { 
            beta[m][i-1] = beta[m][i-1]/temp2; /* normalize */ 
        } 
    } /* beta is done */ 

    /* now to put it together to get approximation of output */ 
    /* and put it in sysfb */ 
    for (i=0; i<numbits; i++) { 
        bpar[0] = exp(-((parity[i] + 1)*(parity[i] + 1))/N); /* components for gamma */ 
        bpar[1] = exp(-((parity[i] - 1)*(parity[i] - 1))/N); 
        probzero = 0; 
        probone = 0; 

        for (m=0; m<numstates; m++) { /* go through all the states */ 
            for (j=0; j<=1; j++) { 
                if (j==0) { 
                    probzero += alfa[m][i]*beta[trans[m][0]][i+1]*bpar[out[m][0]]; 
                } 
                else { 
                    probone += alfa[m][i]*beta[trans[m][1]][i+1]*bpar[out[m][1]]; 
                } 
            } 
        } 

        sysfb[i] = log(probone/probzero); 
    } /* closing brace missing in the scanned listing; the truncation pass below is a separate loop */ 

    for (i=0; i<=numbits-1; i++) { /* truncate to prevent overflow */ 
        if (sysfb[i] > 17) { sysfb[i] = 17; } 
        if (sysfb[i] < -17) { sysfb[i] = -17; } 
    } 
} 

/* program to check errors */ 

int checkerr(int *d, float *sys, int numbits){ 
    int sum=0, i; 
    for (i=0; i<numbits; i++) { 
        if (d[i] == 0) { 
            if (sys[i] >= 0) { 
                sum++; 
            } 
        } 
        else { 
            if (sys[i] <= 0) { 
                sum++; 
            } 
        } 
    } 

    return sum; 
} 

void interfloat(float *data, int M, float **into, float **outof){ 
    int i, j; 
    int p[8] = {17, 37, 19, 29, 41, 23, 13, 7}, inc, ir, jr, eps; 
    /* this is from berrou '95 */ 

    inc = 0; /* load into matrix */ 
    for (i=0; i<M; i++) { 
        for (j=0; j<M; j++) { 
            into[i][j] = data[inc++]; 
        } 
    } 

    for (i=0; i<M; i++) { /* read out of the matrix */ 
        for (j=0; j<M; j++) { 
            ir = ((M/2 + 1)*(i+j))%M; 
            eps = (i+j)%8; 
            jr = ((p[eps]*(j+1)) - 1)%M; 
            outof[i][j] = into[ir][jr]; 
        } 
    } 

    inc = 0; /* read it back into the data stream */ 
    for (i=0; i<M; i++) { 
        for (j=0; j<M; j++) { 
            data[inc] = outof[i][j]; 
            inc++; 
        } 
    } 
} 

void deinterfloat(float *data, int M, float **into, float **outof){ 
    int i, j; 
    int p[8] = {17, 37, 19, 29, 41, 23, 13, 7}, inc, ir, jr, eps; 
    /* this is from berrou '95 */ 

    inc = 0; /* load into matrix */ 
    for (i=0; i<M; i++) { 
        for (j=0; j<M; j++) { 
            outof[i][j] = data[inc++]; 
        } 
    } 

    for (i=0; i<M; i++) { 
        for (j=0; j<M; j++) { 
            ir = ((M/2 + 1)*(i+j))%M; 
            eps = (i+j)%8; 
            jr = ((p[eps]*(j+1)) - 1)%M; 
            into[ir][jr] = outof[i][j]; 
        } 
    } 

    inc = 0; 
    for (i=0; i<M; i++) { 
        for (j=0; j<M; j++) { 
            data[inc] = into[i][j]; 
            inc++; 
        } 
    } 
} 

void interint(int *data, int M, int **into, int **outof){ 
    int i, j; 
    int p[8] = {17, 37, 19, 29, 41, 23, 13, 7}, inc, ir, jr, eps; 
    /* this is from berrou '95 */ 

    inc = 0; /* load into matrix */ 
    for (i=0; i<M; i++) { 
        for (j=0; j<M; j++) { 
            into[i][j] = data[inc++]; 
        } 
    } 

    for (i=0; i<M; i++) { 
        for (j=0; j<M; j++) { 
            ir = ((M/2 + 1)*(i+j))%M; 
            eps = (i+j)%8; 
            jr = ((p[eps]*(j+1)) - 1)%M; 
            outof[i][j] = into[ir][jr]; 
        } 
    } 

    inc = 0; 
    for (i=0; i<M; i++) { 
        for (j=0; j<M; j++) { 
            data[inc] = outof[i][j]; 
            inc++; 
        } 
    } 
} 


void deinterint(int *data, int M, int **into, int **outof){ 
    int i, j; 
    int p[8] = {17, 37, 19, 29, 41, 23, 13, 7}, inc, ir, jr, eps; 
    /* this is from berrou '95 */ 

    inc = 0; /* load into matrix */ 
    for (i=0; i<M; i++) { 
        for (j=0; j<M; j++) { 
            outof[i][j] = data[inc++]; 
        } 
    } 

    for (i=0; i<M; i++) { 
        for (j=0; j<M; j++) { 
            ir = ((M/2 + 1)*(i+j))%M; 
            eps = (i+j)%8; 
            jr = ((p[eps]*(j+1)) - 1)%M; 
            into[ir][jr] = outof[i][j]; 
        } 
    } 

    inc = 0; 
    for (i=0; i<M; i++) { 
        for (j=0; j<M; j++) { 
            data[inc] = into[i][j]; 
            inc++; 
        } 
    } 
} 

int encode(int **g, int in, int *state, int mem) 
/* program to help generate output and state transition matrices, it takes 
the generator matrix, the input, and the state (in integer form) and 
returns the output value and the transition state (in state). memory 
is size of number of delay units. To see how this is done look at 
berrou et al. */ 
{ 
    int i, k, a[4]={0}, b[4]={0}, c=0, fb; 
    k = state[0]; 
    binstat(k, mem, a); 
    c += in; 
    for (i=1; i<=mem; i++) { 
        c += a[i-1]*g[0][i]; /* determines feedback bit c */ 
    } 

    fb = c%2; 
    c = fb; 
    for (i=1; i<=mem; i++) { 
        c += a[i-1]*g[1][i]; 
    } 

    c = c%2; /* c is the outputed bit now. */ 
    /* now to get the next state */ 
    for (i=0; i<mem-1; i++) { /* shifting previous state */ 
        b[i+1] = a[i]; 
    } 

    b[0] = fb; /* putting feedback bit into first space */ 
    state[0] = intstat(mem, b); 
    return c; 
} 

void binstat(int k, int m, int *mvect) 
{ /* converts k into m bit row vector */ 
    int i; 
    for (i=m; i>0; i--) 
    { 
        mvect[m-i] = (int) (k / ((int)pow(2, i-1))); 
        k -= mvect[m-i] * (int)pow(2, i-1); 
    } 
} 

int intstat(int m, int *mvect) 
{ /* converts a m bit row vector, *mvect, into an integer k */ 
    int i, k=0; 
    for (i=0; i<m; i++) { 
        k += mvect[i] * (int)(pow(2, m-i-1)); 
    } 

    return k; 
} 




